400 research outputs found

    Use the Detection Transformer as a Data Augmenter

    Detection Transformer (DETR) is a Transformer-based object detection model. In this paper, we demonstrate that it can also be used as a data augmenter. We call our approach DETR-assisted CutMix, or DeMix for short. DeMix builds on CutMix, a simple yet highly effective data augmentation technique that has gained popularity in recent years. CutMix improves model performance by cutting a patch from one image and pasting it onto another, yielding a new image. The label for this new example is the weighted average of the original labels, where the weights are proportional to the areas of the patches. CutMix selects a random patch to be cut. In contrast, DeMix deliberately selects a semantically rich patch, located by a pre-trained DETR. The label of the new image is specified in the same way as in CutMix. Experimental results on benchmark datasets for image classification demonstrate that DeMix significantly outperforms prior data augmentation methods, including CutMix. Comment: 13 pages
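
The patch-and-label mixing described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `cutmix`, the `(y0, x0, y1, x1)` box convention, and pasting the patch at the same coordinates in the target image are all assumptions. For plain CutMix the box would be drawn at random; for DeMix it would come from a pre-trained detector, which is represented here only by a hard-coded, hypothetical box.

```python
import numpy as np

def cutmix(img_a, label_a, img_b, label_b, box):
    """Paste the region `box` of img_b onto img_a and mix the labels by area.

    box is (y0, x0, y1, x1) in pixel coordinates (an assumed convention).
    Labels are one-hot (or soft) vectors of equal length.
    """
    y0, x0, y1, x1 = box
    h, w = img_a.shape[:2]
    mixed = img_a.copy()                       # leave the input untouched
    mixed[y0:y1, x0:x1] = img_b[y0:y1, x0:x1]  # cut from b, paste onto a
    lam = (y1 - y0) * (x1 - x0) / (h * w)      # fraction of area taken from b
    label = (1.0 - lam) * label_a + lam * label_b
    return mixed, label

# Hypothetical usage: the box stands in for a detector output (DeMix)
# or a randomly sampled rectangle (CutMix).
img_a = np.zeros((4, 4, 3))
img_b = np.ones((4, 4, 3))
mixed, label = cutmix(img_a, np.array([1.0, 0.0]),
                      img_b, np.array([0.0, 1.0]),
                      box=(0, 0, 2, 2))
```

With a 2x2 patch on a 4x4 image, a quarter of the pixels come from the second image, so the mixed label gives weight 0.25 to its class and 0.75 to the first image's class.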

    Learning Discriminative Bayesian Networks from High-dimensional Continuous Neuroimaging Data

    Due to their causal semantics, Bayesian networks (BNs) have been widely employed to discover the underlying data relationships in exploratory studies, such as brain research. Despite its success in modeling the probability distribution of variables, a BN is naturally a generative model, which is not necessarily discriminative. As a result, subtle but critical network changes of investigative value across populations may be overlooked. In this paper, we propose to improve the discriminative power of BN models for continuous variables from two different perspectives. This yields two general discriminative learning frameworks for Gaussian Bayesian networks (GBNs). In the first framework, we employ the Fisher kernel to bridge the generative GBN models and the discriminative SVM classifiers, and convert GBN parameter learning to Fisher kernel learning by minimizing a generalization error bound of SVMs. In the second framework, we employ the max-margin criterion and build it directly upon GBN models to explicitly optimize the classification performance of the GBNs. The advantages and disadvantages of the two frameworks are discussed and experimentally compared. Both demonstrate strong power in learning discriminative parameters of GBNs for neuroimaging-based brain network analysis, while maintaining reasonable representation capacity. The contributions of this paper also include a new Directed Acyclic Graph (DAG) constraint with a theoretical guarantee to ensure the graph validity of the GBN. Comment: 16 pages and 5 figures for the article (excluding appendix)

    Adversarial Feature Stacking for Accurate and Robust Predictions

    Deep Neural Networks (DNNs) have achieved remarkable performance on a variety of applications but are extremely vulnerable to adversarial perturbation. To address this issue, various defense methods have been proposed to enhance model robustness. Unfortunately, the most representative and promising methods, such as adversarial training and its variants, usually degrade model accuracy on benign samples, limiting practical utility. This indicates that it is difficult to extract both robust and accurate features using a single network under certain conditions, such as limited training data, resulting in a trade-off between accuracy and robustness. To tackle this problem, we propose an Adversarial Feature Stacking (AFS) model that can jointly take advantage of features with varied levels of robustness and accuracy, thus significantly alleviating the aforementioned trade-off. Specifically, we adopt multiple networks adversarially trained with different perturbation budgets to extract either more robust features or more accurate features. These features are then fused by a learnable merger to give final predictions. We evaluate the AFS model on the CIFAR-10 and CIFAR-100 datasets with strong adaptive attack methods, and it significantly advances the state of the art in terms of the trade-off. Without extra training data, the AFS model achieves a benign accuracy improvement of 6% on CIFAR-10 and 9% on CIFAR-100 with comparable or even stronger robustness than the state-of-the-art adversarial training methods. This work demonstrates the feasibility of obtaining both accurate and robust models when training data is limited.

    ProtoDiv: Prototype-guided Division of Consistent Pseudo-bags for Whole-slide Image Classification

    Because Whole-Slide Image (WSI) samples are scarce and only weakly labeled, pseudo-bag-based multiple instance learning (MIL) is a promising approach to WSI classification. However, the pseudo-bag dividing scheme, often crucial for classification performance, is still an open topic worth exploring. Therefore, this paper proposes a novel scheme, ProtoDiv, which uses a bag prototype to guide the division of WSI pseudo-bags. Rather than designing a complex network architecture, this scheme takes a plug-and-play approach to safely augment WSI data for effective training while preserving sample consistency. Furthermore, we devise an attention-based prototype that can be optimized dynamically during training to adapt to a classification task. We apply our ProtoDiv scheme to seven baseline models and carry out a group of comparison experiments on two public WSI datasets. Experiments confirm that ProtoDiv usually brings clear performance improvements to WSI classification. Comment: 12 pages, 5 figures, and 3 tables

    Catalytic Asymmetric Reactions between Alkenes and Aldehydes

    This doctoral work describes catalytic asymmetric reactions between alkenes and aldehydes, enabled by the development of chiral Brønsted acids. Valuable, functionalized, enantiomerically enriched cyclic compounds were efficiently furnished from inexpensive and commercially available reagents with high degrees of atom economy. In the first part of this thesis, the first highly enantioselective organocatalytic intramolecular carbonyl−ene cyclization of olefinic aldehydes is presented. In the second part, asymmetric cyclizations via oxocarbenium ions are described. One is a general asymmetric catalytic Prins cyclization of aldehydes with homoallylic alcohols, in which the oxocarbenium ion is attacked intramolecularly by a pendent alkene. The other is an asymmetric oxa-Pictet−Spengler reaction between aldehydes and homobenzyl alcohols, in which the oxocarbenium ion is trapped by an intramolecular arene. The first general asymmetric [4+2]-cycloaddition of simple and unactivated dienes with aldehydes is developed in the last part of this thesis. This methodology is extremely robust and scalable. Valuable enantiomerically enriched dihydropyran compounds could be readily obtained from inexpensive and abundant dienes and aldehydes. New types of confined Brønsted acids were rationally designed and synthesized, including imino-imidodiphosphates (iIDPs), nitrated imidodiphosphates (nIDPs), and imidodiphosphorimidates (IDPis). Beyond the application of these catalysts in various asymmetric reactions between simple alkenes and aldehydes, mechanistic investigations are also disclosed in this doctoral work.

    The Determinants of Going Concern Audit Opinions: Evidence from Shanghai Stock Exchange over 2009 to 2011

    This research examines the factors that lead auditors to issue going concern audit opinions in the Chinese stock market and, through the issuance of such opinions, discusses audit quality in the Chinese reporting system. First, the results show that Chinese auditors are more likely to issue going concern audit opinions to listed companies with poor profitability, low liquidity, lower cash inflows and high leverage. Second, the audit fee paid by a client has no statistically significant effect on auditors' decisions to issue going concern audit opinions. Finally, no significant difference is found between Big Four and non-Big Four audit firms in their propensity to issue going concern audit opinions. Overall, the research suggests that a mature audit profession is forming in developing countries like China, that audit independence is not compromised by economic dependence, and that the majority of Chinese auditors issue going concern opinions mainly on the basis of financial distress indicators. In addition, this paper discusses improvements in audit quality and independence since the new audit standard for going concern was promulgated in 2003. Key Words: going concern audit opinions, risk of bankruptcy, fee independence, size of audit firms, improvements on audit quality and independence

    Q2ATransformer: Improving Medical VQA via an Answer Querying Decoder

    Medical Visual Question Answering (VQA) systems play a supporting role in understanding clinically relevant information carried by medical images. Questions about a medical image fall into two categories: closed-ended (such as Yes/No questions) and open-ended. To obtain answers, the majority of existing medical VQA methods rely on classification approaches, while a few works attempt to use generation approaches or a mixture of the two. The classification approaches are relatively simple but perform poorly on long open-ended questions. To bridge this gap, we propose a new Transformer-based framework for medical VQA, named Q2ATransformer, which integrates the advantages of both the classification and the generation approaches and provides a unified treatment of closed-ended and open-ended questions. Specifically, we introduce an additional Transformer decoder with a set of learnable candidate answer embeddings to query the existence of each answer class for a given image-question pair. Through the Transformer attention, the candidate answer embeddings interact with the fused features of the image-question pair to make the decision. In this way, despite being a classification-based approach, our method provides a mechanism to interact with the answer information for prediction, like the generation-based approaches. On the other hand, by using classification, we mitigate the task difficulty by reducing the search space of answers. Our method achieves new state-of-the-art performance on two medical VQA benchmarks. In particular, for open-ended questions, we achieve 79.19% on VQA-RAD and 54.85% on PathVQA, with 16.09% and 41.45% absolute improvements, respectively.
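
The answer-querying idea in this abstract, where candidate answer embeddings attend to the fused image-question features, can be sketched roughly as below. This is an illustrative simplification under stated assumptions, not the Q2ATransformer architecture: it uses a single attention head with no residual connections, normalization, or stacked layers, all names (`answer_query_scores`, `w_q`, `w_k`, `w_v`, `w_out`) are hypothetical, the weights are random rather than learned, and scoring each answer with an independent sigmoid is an assumption.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def answer_query_scores(answer_emb, fused, w_q, w_k, w_v, w_out):
    """Score each candidate answer against fused image-question features.

    answer_emb: (A, d) one embedding per candidate answer (the queries)
    fused:      (T, d) fused image-question tokens (the keys and values)
    Returns per-answer scores in (0, 1) and the (A, T) attention map.
    """
    q = answer_emb @ w_q
    k = fused @ w_k
    v = fused @ w_v
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)  # each answer attends over tokens
    attended = attn @ v                                       # (A, d) answer-specific summary
    logits = attended @ w_out                                 # (A,) one logit per answer
    return 1.0 / (1.0 + np.exp(-logits)), attn

# Hypothetical sizes: 5 candidate answers, 7 fused tokens, width 8.
rng = np.random.default_rng(0)
A, T, d = 5, 7, 8
scores, attn = answer_query_scores(
    rng.normal(size=(A, d)), rng.normal(size=(T, d)),
    rng.normal(size=(d, d)), rng.normal(size=(d, d)),
    rng.normal(size=(d, d)), rng.normal(size=d),
)
```

Because every candidate answer has its own query, the prediction stays a classification over a fixed answer set while still letting answer information interact with the image-question features.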